AMENDMENTS TO THE CLAIMS 

The following listing of claims will replace all prior versions and listings of claims 
in the application. 

Listing Of Claims 

1. (Currently Amended) A system for tuning [[the]] a text-to-speech conversion 
process, the system comprising: 

a text-to-speech engine, said text-to-speech engine receiving at least one text- 
input and converting said text-input into a processed representation, said processed 
representation including at least one segment and a word-boundary each associated 
with at least one speech feature, wherein said at least one speech feature of said word- 
boundary includes boundary strength and pause duration; and 

a visual editing interface displaying said processed representation using at least 
one graphical indicator including displayed segments and displayed boundaries on an 
output device, wherein said visual editing interface displays and allows editing of said at 
least one speech feature of said word-boundary by editing (a) a displayed boundary and 
(b) spacing between a displayed segment and said displayed boundary. 

2. (Previously Presented) The system of Claim 1 wherein said visual editing 
interface provides at least one editing function to a user, the editing function enabling 
modification of said speech feature associated with said segment through a change in 
the corresponding said graphical indicator. 

3. (Original) The system of Claim 2 wherein said visual editing interface 
associates said speech feature corresponding to said segment with said graphical 
indicator, wherein the user's modification of said graphical indicator results in a 
corresponding change in said speech feature of said segment. 

4. (Original) The system of Claim 1 wherein said speech feature is at least 
one of the following: normalized text, part-of-speech, parsing of text, chunking of text, 
boundary strength, pause duration, transcription, speech rate, syllable duration, 
segment duration, pitch, word prominence, emphasis, formant mixing mode, unit 
selection override, intensity contour, formant trajectories, and allophone rules. 

5. (Original) The system of Claim 1 wherein said graphical indicator 
comprises at least one of the following: graphical style, font faces, coloring, vertical 
spacing, horizontal spacing, italicization, boldness, underlining, blinking, crossing-out, 
text orientation, text rotation, punctuation symbols and graphical symbols. 

6. (Original) The system of Claim 1 wherein said processed representation 
employs a parameterized aligned sound records format. 

7. (Original) The system of Claim 1 wherein said segment comprises at least 
one of the following: word, letter, syllable, pause, word boundary and punctuation-mark. 

8. (Original) The system of Claim 1 wherein said visual editing interface 
operates as a plug-in for a graphical user interface. 

9. (Original) The system of Claim 8 wherein said plug-in is an ActiveX 
control. 

10. (Previously Presented) The system of Claim 1 wherein said visual editing 
interface allows definition of said input-text by providing a set of text messages 
containing non-editable text and editable blank slots into which at least part of said 
input-text can be entered. 

11. (Original) The system of Claim 1 wherein said visual editing interface is 
language independent. 

12. (Original) The system of Claim 1 wherein said visual editing interface 
provides the user with speech audio output of said processed representation. 

13. (Original) The system of Claim 1 wherein visual editing interface is 
connected to a data-store for storing and retrieving said representation. 

14. (Previously Presented) The system of Claim 1 wherein said processed 
representation is a modified textual representation of the processed input-text. 

15. (Previously Presented) The system of Claim 14 wherein the input text is 
used to generate said processed representation. 

16. (Previously Presented) The system of Claim 15 wherein said modified 
textual representation is stored and accessed from a data store. 

17. (Previously Presented) The system of Claim 14 wherein said modified 
textual representation is used to generate synthesized speech using a TTS system 
distinct from said text-to-speech engine. 

18. (Previously Presented) A system for providing a text-to-speech interface, 
the system comprising: 

a visual interface connected to a text-to-speech engine; and 
at least one communication channel connecting said visual interface to said text- 
to-speech engine, said text-to-speech engine communicating with said visual interface 
over said communication channel by sending and receiving at least one data segment 
in a format, 

wherein said visual interface communicates variations in one or more types of 
speech features associated with segments of said data by varying visual display 
properties of the segments, at least one of said speech features includes boundary 
strength and pause duration, and said visual display properties are applied to at least 
one of (a) a displayed boundary between adjacent segments and (b) spacing between a 
segment and said displayed boundary. 

19. (Original) The system of claim 18 wherein said format of said data 
segment is a parameterized aligned sound records format. 

20. (Currently Amended) The system of claim [[18]]19 wherein said text-to- 
speech engine sends said data segment in the parameterized aligned sound records 
format to said visual interface, said visual interface rendering said data segment in a 
visual form, said visual interface allowing editing of said data segment to produce an 
edited data segment, said visual interface sending said edited data segment to said 
text-to-speech engine. 

21. (Original) The system of claim 18 wherein said visual interface sends data 
to said text-to-speech engine over a first said communication channel and said text-to- 
speech engine sends data to said visual interface over a second said communication 
channel. 



Serial No. 10/776,892 



22. (Previously Presented) A method for visual tuning text-to-speech 
conversion process, the method comprising: 

converting an input-text to a processed representation using a text-to-speech 
engine, said processed representation including at least one speech feature of said 
input-text; 

displaying said processed representation on a visual editing interface connected 
to said text-to-speech engine, said speech feature of said processed representation 
being displayed in a corresponding graphical form; 

communicating variations in one or more types of speech features associated 
with segments of said representation by varying visual display properties of the 
segments, wherein said speech features include boundary strength and pause duration, 
and said visual display properties are applied to at least one of (a) a displayed boundary 
between adjacent segments and (b) spacing between a segment and said displayed 
boundary; and 

providing an editing function in said visual editing interface to a user for modifying 
said speech feature in said graphical form. 

23. (Original) The method of Claim 22 further comprising: 

generating speech audio equivalent of said processed representation through 
said visual editing interface. 

24. (Original) The method of Claim 22 further comprising: 
saving said processed representation in a data store; and 
loading said processed representation stored in said data store into said visual 
editing interface. 

25. (Previously Presented) The method of Claim 22 further comprising: 
converting said processed representation into a modified textual representation 
of the processed input-text. 

26. (Previously Presented) The method of Claim 25 further comprising: 
converting said textual representation into a processed representation, 

wherein the input text is used to generate said processed representation. 

27. (Previously Presented) The method of Claim 25 further comprising: 
storing said modified textual representation in a data store; and 

loading said modified textual representation stored in said data store into said 
visual editing interface. 

28. (Previously Presented) The method of Claim 25 further comprising: 
using said modified textual representation to synthesize speech using a TTS 
system distinct from said text-to-speech engine. 

29. (Previously Presented) The system of claim 1, wherein said visual editing 
interface displays a modified textual representation of said text-input, and variations in 
visual display for communicating different speech features individually associated with 
different textual segments of the textual representation include a combination of at least 
two of: (a) variations in graphical length of the textual segments; (b) variations in 
vertical positions of the textual segments; (c) variations in horizontal spacing of the 
textual segments; (d) variations in font faces of the textual segments; (e) variations in 
coloring of the textual segments; (f) variations in styles of the textual segments; (g) 
variations in orientation of the textual segments; (h) variations in rotation of the textual 
segments; or (i) punctuation of the textual segments. 